Probabilistic planning with non-linear utility functions and worst-case guarantees

Authors

  • Stefano Ermon
  • Carla P. Gomes
  • Bart Selman
  • Alexander Vladimirsky
Abstract

Markov Decision Processes are one of the most widely used frameworks for formulating probabilistic planning problems. Because planners are often risk-sensitive in high-stakes situations, non-linear utility functions are commonly introduced to describe their preferences among all possible outcomes. Alternatively, risk-sensitive decision makers may require their plans to satisfy certain worst-case guarantees. We show how to combine these two approaches by considering problems where we maximize the expected utility of the total reward subject to worst-case constraints. We generalize several existing results on the structure of optimal policies to the constrained case, for both finite and infinite horizon problems. We provide a Dynamic Programming algorithm to compute the optimal policy, and we introduce an admissible heuristic to effectively prune the search space. Finally, we use a stochastic shortest path problem on large real-world road networks to demonstrate the practical applicability of our method.
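To make the problem statement concrete, here is a minimal sketch of a finite-horizon dynamic program that maximizes the expected utility of the total reward while requiring the total reward to stay above a threshold on every possible outcome. It is an illustration under stated assumptions, not the algorithm from the paper: the toy MDP, the names `TRANSITIONS`, `solve`, and `min_total`, and the choice to augment the state with the reward accumulated so far are all hypothetical, the utility is assumed to be finite-valued, and the admissible heuristic used for pruning is not included.

```python
import math
from functools import lru_cache

# Hypothetical toy MDP: TRANSITIONS[state][action] = [(prob, next_state, reward), ...]
TRANSITIONS = {
    "start": {
        "safe": [(1.0, "end", 2.0)],
        "risky": [(0.5, "end", 0.0), (0.5, "end", 10.0)],
    },
    "end": {"stay": [(1.0, "end", 0.0)]},
}


def solve(transitions, utility, horizon, min_total, start):
    """Return (value, first_action) maximizing E[utility(total reward)]
    subject to total reward >= min_total on every positive-probability outcome.

    Assumes utility() returns finite values; -inf marks infeasibility.
    """

    @lru_cache(maxsize=None)
    def value(t, state, total):
        # Augmented state: (time step, MDP state, reward accumulated so far).
        if t == horizon:
            if total >= min_total:
                return (utility(total), None)
            return (-math.inf, None)  # worst-case constraint violated
        best = (-math.inf, None)
        for action, outcomes in transitions[state].items():
            expected = 0.0
            for prob, nxt, reward in outcomes:
                if prob == 0.0:
                    continue  # impossible outcomes do not affect feasibility
                v, _ = value(t + 1, nxt, total + reward)
                if v == -math.inf:
                    # One possible outcome breaks the guarantee: reject the action.
                    expected = -math.inf
                    break
                expected += prob * v
            if expected > best[0]:
                best = (expected, action)
        return best

    return value(0, start, 0.0)


if __name__ == "__main__":
    # Linear utility, no worst-case constraint.
    print(solve(TRANSITIONS, lambda w: w, 1, -math.inf, "start"))
    # Same utility, but the total reward must be at least 1 in every outcome.
    print(solve(TRANSITIONS, lambda w: w, 1, 1.0, "start"))
    # Risk-averse exponential utility, no worst-case constraint.
    print(solve(TRANSITIONS, lambda w: -math.exp(-w), 1, -math.inf, "start"))
```

In this toy instance, either the worst-case constraint or a sufficiently risk-averse utility is enough to switch the preferred first action from `risky` to `safe`, which is the interaction between the two approaches that the abstract describes.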

Similar resources

Sensor Planning with Non-linear Utility Functions

Sensor planning is concerned with when to sense and what to sense. We study sensor planning in the context of planning objectives that trade off between minimizing the worst-case, expected, and best-case plan-execution costs. Sensor planning with these planning objectives is interesting because they are realistic and the frequency of sensing changes with the planning objective: more pessimistic decis...

Worst-Case Portfolio Optimization under Stochastic Interest Rate Risk

We investigate a portfolio optimization problem under the threat of a market crash, where the interest rate of the bond is modeled as a Vasicek process, which is correlated with the stock price process. We adopt a non-probabilistic worst-case approach for the height and time of the market crash. On a given time horizon [0, T], we then maximize the investor’s expected utility of terminal wealth...

Probabilistic Integrated Planning of Primary and Secondary Distribution Networks based on a Hybrid Heuristic and GA Approach

The integrated planning of distribution systems is a complex, non-linear problem involving integer and discontinuous variables. Due to these technical and modeling complexities, many researchers tend to optimize the primary and secondary distribution networks individually, which reduces the accuracy of the results. Accordingly, the integrated planning of these networks is p...

Online and Random-order Load Balancing Simultaneously

We consider the problem of online load balancing under lp-norms: sequential jobs need to be assigned to one of the machines and the goal is to minimize the lp-norm of the machine loads. This generalizes the classical problem of scheduling for makespan minimization (case l∞) and has been thoroughly studied. However, despite the recent push for beyond worst-case analyses, no such results are know...

To Save Or Not To Save: The Fisher Game

We examine the Fisher market model when buyers, as well as sellers, have an intrinsic value for money. We show that when the buyers have oligopsonistic power they are highly incentivized to act strategically with their monetary reports, as their potential gains are unbounded. This is in contrast to the bounded gains that have been shown when agents strategically report utilities [5]. Our main f...

Publication date: 2012